2026-03-01 – Provider Arsenal & Memory Upgrade
Session Overview
Major expansion of AI provider coverage, native memory system upgrade, and planning for sub-agent architecture. Shift from "get it working" to "get it optimized."
New Provider Keys Configured
All tested, saved to both the-ford-estate.env and ~/.openclaw/.env:
- Mistral (MISTRAL_API_KEY) – 56 models, free tier, 60 RPM
- DeepSeek (DEEPSEEK_API_KEY) – $0 balance (needs top-up)
- Groq (GROQ_API_KEY) – free tier, very fast, low latency
- DashScope China (ALIBABA_CLOUD_API_CN) – 161 models, but they require activation
- NVIDIA NIM (NVIDIA_API_KEY) – 40+ S+ tier models, free
- Cerebras (CEREBRAS_API_KEY) – fastest inference available, free
- Codestral (CODESTRAL_API_KEY) – free coding-only Mistral models
- Together AI (TOGETHER_API_KEY) – $25 free credit
- SambaNova (SAMBANOVA_API_KEY) – free tier, fast inference
- HuggingFace (HF_TOKEN) – valid, but endpoint routing still needs config
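A quick sanity check before wiring the keys into openclaw can be sketched like this; the variable names are the ones listed above, and the check only confirms each variable is set in the environment, not that the key is actually valid at the provider:

```python
import os

# Provider key names as saved in the-ford-estate.env / ~/.openclaw/.env
PROVIDER_KEYS = [
    "MISTRAL_API_KEY", "DEEPSEEK_API_KEY", "GROQ_API_KEY",
    "ALIBABA_CLOUD_API_CN", "NVIDIA_API_KEY", "CEREBRAS_API_KEY",
    "CODESTRAL_API_KEY", "TOGETHER_API_KEY", "SAMBANOVA_API_KEY", "HF_TOKEN",
]

def missing_keys(env=os.environ):
    """Return the provider keys that are absent or empty in the environment."""
    return [k for k in PROVIDER_KEYS if not env.get(k)]

if __name__ == "__main__":
    gaps = missing_keys()
    print("all provider keys present" if not gaps else f"missing: {', '.join(gaps)}")
```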
Provider Status
| Provider | Status | Action Needed |
|---|---|---|
| DeepSeek | ⚠️ $0 balance | Top up at platform.deepseek.com |
| xAI/Grok | ⚠️ No credits | Purchase at console.x.ai |
| Z.AI | ⚠️ Depleted | Top up at open.bigmodel.cn |
| DashScope | ⚠️ Models need activation | Activate in Alibaba console |
| ElevenLabs | ⚠️ Quota exhausted | Needs plan upgrade |
Custom Providers Configured
Added to models.providers:
- nvidia – integrate.api.nvidia.com/v1 (6 models: Kimi K2.5, GLM 5, DeepSeek V3.2, Qwen3 Coder, MiniMax M2.1, Llama 3.3)
- deepseek – api.deepseek.com (2 models: chat, reasoner)
- dashscope – dashscope.aliyuncs.com (6 Qwen/DeepSeek models)
- codestral – codestral.mistral.ai (codestral-latest)
- sambanova – api.sambanova.ai (Llama 3.3, DeepSeek V3)
- together – api.together.xyz (3 models)
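Since these providers all expose OpenAI-compatible endpoints, a single router can resolve a `provider/model` id to the right base URL. A minimal sketch, with hostnames taken from the list above; the exact path suffixes and the `resolve` helper are assumptions, not openclaw's actual internals:

```python
# Base URLs from the custom-provider entries above.
PROVIDER_BASE_URLS = {
    "nvidia":    "https://integrate.api.nvidia.com/v1",
    "deepseek":  "https://api.deepseek.com",
    "dashscope": "https://dashscope.aliyuncs.com",
    "codestral": "https://codestral.mistral.ai",
    "sambanova": "https://api.sambanova.ai",
    "together":  "https://api.together.xyz",
}

def resolve(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' id into (base_url, bare model name)."""
    provider, _, model = model_id.partition("/")
    if provider not in PROVIDER_BASE_URLS:
        raise KeyError(f"unknown provider: {provider}")
    return PROVIDER_BASE_URLS[provider], model
```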
Model Strategy Defined
- Tier 0 (Free): NVIDIA NIM, Cerebras, Groq, Codestral, SambaNova, OpenRouter
- Tier 1 (Cheap): Gemini Flash, Mistral Small, Together AI
- Tier 2 (Moderate): GPT-4o-mini, Mistral Medium, Gemini Pro
- Tier 3 (Premium): GPT-4o, o3, Opus (last resort)
Target: 80% of usage at $0, 15% for pennies (Gemini Flash), 5% for dollars (Opus, only when needed)
Per-Agent Model Assignments
| Agent | Primary | Fallback 1 | Fallback 2 | Complex |
|---|---|---|---|---|
| Ada | nvidia/kimi-k2.5 | cerebras/gpt-oss-120b | google/gemini-2.5-flash | Opus subagent |
| K2 | nvidia/deepseek-v3.2 | codestral/codestral-latest | cerebras/qwen3-235b | Opus subagent |
| Cora | nvidia/kimi-k2.5 | google/gemini-2.5-flash | mistral/mistral-medium | – |
| Winston | google/gemini-2.5-flash | nvidia/kimi-k2.5 | groq/llama-3.3-70b | – |
| Synergy | google/gemini-2.5-flash | nvidia/kimi-k2.5 | groq/llama-3.3-70b | – |
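The fallback chains in the table can be sketched as a simple retry loop; `call` is a stand-in for the actual completion request (not a real openclaw API), and the chains are copied from the assignments above:

```python
# Fallback chains from the assignment table (primary, fallback 1, fallback 2).
AGENT_CHAINS = {
    "ada":     ["nvidia/kimi-k2.5", "cerebras/gpt-oss-120b", "google/gemini-2.5-flash"],
    "k2":      ["nvidia/deepseek-v3.2", "codestral/codestral-latest", "cerebras/qwen3-235b"],
    "cora":    ["nvidia/kimi-k2.5", "google/gemini-2.5-flash", "mistral/mistral-medium"],
    "winston": ["google/gemini-2.5-flash", "nvidia/kimi-k2.5", "groq/llama-3.3-70b"],
    "synergy": ["google/gemini-2.5-flash", "nvidia/kimi-k2.5", "groq/llama-3.3-70b"],
}

def complete_with_fallback(agent, prompt, call):
    """Try each model in the agent's chain until one succeeds.
    `call(model_id, prompt)` may raise on rate limits, depleted balances,
    or provider outages; the next model in the chain is tried on failure."""
    errors = []
    for model_id in AGENT_CHAINS[agent]:
        try:
            return model_id, call(model_id, prompt)
        except Exception as exc:
            errors.append((model_id, exc))
    raise RuntimeError(f"all models failed for {agent}: {errors}")
```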
Memory System Upgrade (Native)
| Feature | Before | After |
|---|---|---|
| Embeddings | OpenAI | Gemini (~20x cheaper) |
| Search | Vector only | Hybrid BM25 + Vector |
| Diversity | Redundant hits | MMR re-ranking (λ=0.7) |
| Recency | Flat | Temporal decay (30d half-life) |
| Scope | Workspace only | +K2/Cora/Winston/Synergy/ada-lab |
| Sessions | Not searchable | Full transcript indexing |
| Caching | Off | 50K entries |
All configured via openclaw config set, validated, gateway restarted cleanly.
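The upgraded retrieval pipeline can be sketched end to end: blend BM25 and vector scores, apply a 30-day half-life recency decay, then re-rank with MMR at λ=0.7. The blend weight `alpha` and the helper names are illustrative assumptions, not openclaw's documented internals; only the λ and half-life values come from the table above:

```python
import math

HALF_LIFE_DAYS = 30.0   # temporal decay half-life from the config above
MMR_LAMBDA = 0.7        # relevance/diversity trade-off from the config above

def decay(age_days: float) -> float:
    """Temporal decay: a memory's score halves every 30 days."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

def hybrid_score(bm25: float, vector_sim: float, age_days: float,
                 alpha: float = 0.5) -> float:
    """Blend lexical (BM25) and vector scores, then apply recency decay.
    alpha is an assumed blend weight, not a documented openclaw setting."""
    return (alpha * bm25 + (1 - alpha) * vector_sim) * decay(age_days)

def mmr(candidates, sim_to_query, sim_between, k=5, lam=MMR_LAMBDA):
    """Maximal Marginal Relevance: pick results that are relevant to the
    query but dissimilar to what has already been selected."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        best = max(
            pool,
            key=lambda c: lam * sim_to_query[c]
            - (1 - lam) * max((sim_between[(c, s)] for s in selected), default=0.0),
        )
        selected.append(best)
        pool.remove(best)
    return selected
```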
Model Scanner Cron
- 5am daily job configured
- Scans OpenAI, Mistral, Groq, DeepSeek, DashScope for model changes
- Tracks rate limit changes
- First baselines captured: OpenAI 120, Mistral 56, Groq 20, DeepSeek 2, DashScope 161 models
- Reports findings to morning briefing
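The scanner's core is a list-and-diff: pull each provider's model list from its OpenAI-compatible `/v1/models` endpoint and compare against the stored baseline. A minimal sketch; the endpoint shape is the standard OpenAI listing format, but how openclaw stores baselines is an assumption:

```python
import json
import urllib.request

def fetch_model_ids(base_url: str, api_key: str) -> set[str]:
    """List model ids from an OpenAI-compatible /v1/models endpoint."""
    req = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return {m["id"] for m in data["data"]}

def diff_models(baseline: set[str], current: set[str]) -> dict[str, list[str]]:
    """Compare today's scan against the stored baseline for the briefing."""
    return {
        "added": sorted(current - baseline),
        "removed": sorted(baseline - current),
    }
```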
Free Provider Reference
Created comprehensive guide at memory/free-provider-reference.md:
- 19 providers with free tiers identified
- Priority signups: NVIDIA NIM, Cerebras, SambaNova, Codestral, HuggingFace, Cohere
- Rate limits documented
- Signup URLs compiled
Current Provider Arsenal
- Working (11): OpenAI, Anthropic, Gemini, Mistral, Groq, OpenRouter, NVIDIA NIM, Cerebras, Codestral, Together, SambaNova
- Known free (29 on OpenRouter): GPT-OSS-120B, Llama 3.3 70B, Qwen3 Coder, Hermes 405B, etc.
- Total accessible: 200+ models
Sub-Agent Architecture Research (Pending)
Problem: Domain-specific agents (K2, Cora, Winston, Synergy) all need similar functions (document retrieval, deep research) but currently no shared sub-agent layer.
Key Questions:
1. Shared service sub-agents vs domain-specific?
2. How to handle context passing between parent → sub-agent?
3. Tool access patterns (read-only vs read-write)?
4. Lifecycle: spawn → task → result → dispose vs persistent workers?
Sources to research:
- GitHub starred repos (personal)
- Reddit communities (r/MachineLearning, r/LocalLLaMA, r/OpenClaw, r/AutoGPT, r/CrewAI, r/ClaudeAI, r/ChatGPTCoding)
- Free-coding-models library architecture
- Agent-team-orchestration skill on ClawHub
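The spawn → task → result → dispose option can be sketched as a context manager: the sub-agent workspace exists only for the duration of one task, and nothing persists afterward. This is a generic disposable-worker pattern, not openclaw's actual spawn mechanism; the SOUL.md file mirrors the minimal-personality convention mentioned elsewhere in these notes:

```python
import pathlib
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def disposable_agent(role: str):
    """spawn -> task -> result -> dispose: workspace lives for one task only."""
    workspace = pathlib.Path(tempfile.mkdtemp(prefix=f"agent-{role}-"))
    # Minimal SOUL.md: functional, no personality.
    (workspace / "SOUL.md").write_text(f"# {role}\nFunctional sub-agent.\n")
    try:
        yield workspace  # parent writes task context in, reads result out
    finally:
        shutil.rmtree(workspace)  # dispose: no state survives the task
```

Usage: `with disposable_agent("research") as ws: ...` — the alternative (persistent workers) would skip the teardown and reuse the workspace across tasks.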
Next Actions
- [ ] Research sub-agent patterns from starred GitHub repos
- [ ] Identify Reddit communities for daily monitoring
- [ ] Define shared service sub-agent taxonomy
- [ ] Top up DeepSeek ($5-10) for cheap reasoning tier
- [ ] Configure HuggingFace endpoint routing
- [ ] Set Telegram profile pics via @BotFather
Morning Update – 2026-03-01 09:30 EST
Sub-Agent Architecture Implementation
- Created 4 disposable sub-agent workspaces: research, rag, coding, analysis
- Minimal SOUL.md files (functional, no personality)
- Spawn guide created at agents/SPAWN.md
- Test research spawn in progress
Key Insight
2026 multi-agent trend: Specialized agents in sequence (extractor → analyzer → checker), not "one mega-bot." Frameworks: LangGraph (stateful), AutoGen, Conductor, Swarm.
Reddit Communities Ready
| Community | Priority | Agent |
|---|---|---|
| r/OpenClaw | CRITICAL | Ada |
| r/CrewAI | HIGH | Ada |
| r/LocalLLaMA | HIGH | K2 |
| r/homelab | HIGH | K2 |
| r/realestate | HIGH | Cora |
| r/ClaudeAI | HIGH | K2/Ada |
| r/selfhosted | HIGH | K2 |
| r/Proxmox | HIGH | K2 |
| r/MachineLearning | MEDIUM | Ada |
| r/AutoGPT | MEDIUM | Ada |
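The daily-scan core for these communities can be sketched with Reddit's public JSON listing (`reddit.com/r/<sub>/new.json`, which needs no auth, only a User-Agent header); the watchlist entries come from the table above, while the keyword filter and its keywords are illustrative:

```python
import json
import urllib.request

WATCHLIST = {  # subreddit -> (priority, agent), from the table above
    "OpenClaw": ("CRITICAL", "Ada"),
    "CrewAI": ("HIGH", "Ada"),
    "LocalLLaMA": ("HIGH", "K2"),
}

def fetch_new_posts(subreddit: str, limit: int = 25) -> list[dict]:
    """Pull the newest posts via Reddit's public JSON listing."""
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
    req = urllib.request.Request(url, headers={"User-Agent": "agent-monitor/0.1"})
    with urllib.request.urlopen(req) as resp:
        listing = json.load(resp)
    return [child["data"] for child in listing["data"]["children"]]

def relevant(posts: list[dict], keywords: list[str]) -> list[dict]:
    """Keep posts whose title mentions any watched keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p.get("title", "").lower() for k in lowered)]
```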
Next
- Set up Reddit monitoring crons (daily scanning)
- Implement shared service sub-agents
- Review GitHub starred repos (top 15 identified)